Neural Multigrid

Authors

  • Tsung-Wei Ke
  • Michael Maire
  • Stella X. Yu
Abstract

We propose a multigrid extension of convolutional neural networks (CNNs). Rather than manipulating representations living on a single spatial grid, our network layers operate across scale space, on a pyramid of tensors. They consume multigrid inputs and produce multigrid outputs; convolutional filters themselves have both within-scale and cross-scale extent. This aspect is distinct from simple multiscale designs, which only process the input at different scales. Viewed in terms of information flow, a multigrid network passes messages across a spatial pyramid. As a consequence, receptive field size grows exponentially with depth, facilitating rapid integration of context. Most critically, multigrid structure enables networks to learn internal attention and dynamic routing mechanisms, and use them to accomplish tasks on which modern CNNs fail. Experiments demonstrate wide-ranging performance advantages of multigrid. On CIFAR image classification, switching from a single grid to multigrid within standard CNN architectures improves accuracy with a modest increase in compute and parameters. Multigrid is independent of other architectural choices; we show synergistic results in combination with residual connections. On tasks demanding per-pixel output, gains can be substantial. We show dramatic improvement on a synthetic semantic segmentation dataset. Strikingly, we show that relatively shallow multigrid networks can learn to directly perform spatial transformation tasks, where, in contrast, current CNNs fail. Together, our results suggest that continuous evolution of features on a multigrid pyramid could replace virtually all existing CNN designs.
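
The claim that cross-scale message passing speeds up the integration of context can be illustrated with a toy reachability count. The sketch below is not the authors' implementation. It assumes a 1-D pyramid whose levels halve in size, 3-tap filters, downsampling that merges pairs of cells, and nearest-neighbor upsampling; none of these details are stated in the abstract. The function name `layers_to_span` is hypothetical. It counts how many layers an impulse at one end of the finest grid needs before it can influence the other end.

```python
def layers_to_span(n, multigrid):
    """Count 3-tap layers until an impulse at cell 0 of the finest 1-D grid
    can influence cell n-1. Toy model; n is assumed to be a power of two."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")

    # Grid sizes: only the finest grid, or a full pyramid n, n/2, ..., 1.
    sizes = [n]
    if multigrid:
        while sizes[-1] > 1:
            sizes.append(sizes[-1] // 2)

    # active[k][i] is True once cell i on level k depends on the impulse.
    active = [[False] * s for s in sizes]
    active[0][0] = True

    layers = 0
    while not active[0][n - 1]:
        updated = []
        for k, s in enumerate(sizes):
            merged = list(active[k])
            if k > 0:
                # Assumed downsampling: each cell merges two finer cells.
                finer = active[k - 1]
                for i in range(s):
                    merged[i] = merged[i] or finer[2 * i] or finer[2 * i + 1]
            if k + 1 < len(sizes):
                # Assumed nearest-neighbor upsampling from the coarser level.
                coarser = active[k + 1]
                for i in range(s):
                    merged[i] = merged[i] or coarser[i // 2]
            # A 3-tap filter mixes each cell with its immediate neighbors.
            updated.append([
                any(merged[j] for j in range(max(0, i - 1), min(s, i + 2)))
                for i in range(s)
            ])
        active = updated
        layers += 1
    return layers


if __name__ == "__main__":
    for size in (8, 64, 512):
        print(size, layers_to_span(size, False), layers_to_span(size, True))
```

In this toy model the single-grid count is n - 1, because the impulse spreads by one cell per layer. The pyramid count is bounded by roughly twice the number of levels, because the signal can climb to a coarse level, travel there, and descend again. A layer count that scales with the logarithm of the grid size is the same behavior as a receptive field that grows exponentially with depth.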

Related sources

Self-Organizing Multi-Resolution Grid for Motion Planning and Control

A fully self-organizing neural network approach to low-dimensional control problems is described. We consider the problem of learning to control an object and solving the path planning problem at the same time. Control is based on the path planning model that follows the gradient of the stationary solution of a diffusion process working in the state space. Previous works are extended by introdu...

A new method based on SOM network to generate coarse meshes for overlapping unstructured multigrid algorithm

A new method, based on the self-organizing map (SOM) neural network, for generating coarse meshes for the overlapping unstructured multigrid algorithm is presented in this paper. The SOM neural network can overcome some limitations of conventional methods and is designed to pursue the best structural relation between fine and coarse unstructured meshes, with the aim of ensuring robust c...

Multigrid meets neural nets (arXiv:hep-lat/9211031v2, 25 Nov 1992)

We present evidence that multigrid (MG) works for wave equations in disordered systems, e.g. in the presence of gauge fields, no matter how strong the disorder. We introduce a "neural computations" point of view into large scale simulations: First, the system must learn how to do the simulations efficiently, then do the simulation (fast). The method can also be used to provide smooth interpol...

Parallelizing Over Artificial Neural Network Training Runs with Multigrid

Artificial neural networks are a popular and effective machine learning technique. Great progress has been made on speeding up the expensive training phase of neural networks, leading to highly specialized pieces of hardware, many based on GPU-type architectures, and more concurrent algorithms such as synthetic gradients. However, the training phase continues to be a bottleneck, where the tra...

Learning across scales - A multiscale method for Convolution Neural Networks

In this work we explore the connection between Convolution Neural Networks, partial differential equations, multigrid/multiscale methods and optimal control. We show that convolution neural networks can be represented as a discretization of nonlinear partial differential equations, and that the learning process can be interpreted as a control problem where we attempt to estimate the coeffic...

Journal:
  • CoRR

Volume abs/1611.07661

Publication date: 2016